REMARKS

After entry of the foregoing amendments, claims 6 to 13, 16, 17, 19-22, 24 to 26 and 28 
to 32 will remain pending. Claims 14, 15, 18 and 23 are canceled herein, without prejudice to 
pursuit in one or more continuing applications. Claims 6-8, 25, 26, and 28-31 are currently 
under examination and stand rejected under 35 U.S.C. § 103 as allegedly obvious over the
combined teachings of the DOCK 4.0 User's Guide (1998) as further evidenced by Kuntz 
(Science (1992), vol. 257, pages 1078-1082) ("Kuntz") in view of Takasaki et al. (Nature 
Biotechnology (1997) Vol. 15: pages 1266-1270) ("Takasaki") and further in view of Tang et al.
(Chemistry and Biology (1997) Vol. 4, pages 453-459) ("Tang"). 

Applicants respectfully disagree that the cited references establish the prima facie 
obviousness of any of the previously pending claims. However, in the interest of advancing 
prosecution of this application, Applicants have amended the claims to more clearly define the 
claimed invention, and in the process further distinguish over the prior art. In particular, the 
claims have been amended to recite that the target protein is a membrane-bound protein, a 
cytosolic protein, a nuclear protein, a cytokine, a lymphokine, a chemokine, an adhesion 
molecule, a growth factor, or a receptor thereof and that the modifier is a proteinaceous 
modifier. Support for these amendments may be found in the application as filed, for example, 
in the paragraph bridging pages 6 and 7 (target proteins) and in the first full paragraph on page 7 
(proteinaceous modifiers). 1 No new matter is added. 

Entry of these amendments after final is respectfully requested because the amendments 
would not require further searching and place the application in condition for allowance. 



1 The claims have also been amended to delete the term "small molecule," which had been inserted
in Applicants' previous amendment. Applicants had inserted this term intending that it be given its usual
meaning as would be ascribed by those skilled in the art of pharmacology and biochemistry, i.e., an
organic compound that is not a polymer, which would thereby exclude compounds such as nucleic acids,
proteins and polysaccharides. (See, e.g., http://en.wikipedia.org/wiki/Small_molecule.) In view of the
Office's position that the term is not interpreted to exclude ribosomal constructs (Office Action at page
7), Applicants have deleted the term from the claims.
Rejection Under 35 U.S.C. § 103

Applicants respectfully submit that the amended claims clearly define over the references
cited in the Office Action dated May 19, 2009.

The claims are directed to methods of identifying allosteric modulators of intermolecular 
interactions between certain target proteins and proteinaceous modifiers. As the Examiner will
understand, the term "allosteric modulator" in this context means a compound that will act on a
protein at a site that is distinct from the functionally critical site involved
in the intermolecular interaction. As further amended herein, the claims specify that the
allosteric modulator is a compound that contains at least one functional group that can be 
accommodated by an allosteric cavity that is within 15 to 20 angstroms from the functionally 
critical site. Through practice of the claimed methods, compounds may be identified that will 
interact with the target protein at the allosteric site, and as a result modulate the intermolecular 
interaction that may occur between the target protein and a proteinaceous modifier at the nearby 
(i.e., within about 15-20 angstroms) functionally critical site. Importantly, these methods allow
the identification, for example, of small molecules that non-competitively inhibit various protein- 
protein interactions. 

Applicants respectfully submit that the claimed methods are neither taught nor suggested 
by any proper combination of the DOCK User's Guide (as evidenced by Kuntz), Takasaki and 
Tang. 

The DOCK User's Guide fails to mention any allosteric cavity. The Office Action 
addresses this deficiency by combining the User's Guide with the Kuntz reference, stating that 
"sites that are found by the program include active regions of enzymes, recognition and allosteric 
features." (Office Action at page 4.) It should be noted, however, that it is not at all clear that 
this reference in Kuntz to "allosteric features" should be interpreted as a reference to "allosteric 
cavities" that can be used to identify allosteric modulators of intermolecular interactions at the 
functionally critical site of the target protein. Indeed, the Kuntz article at this point cites to 
reference (45), which is an earlier article by Kuntz, et al. (copy enclosed). This earlier article at 
most suggests that the DOCK program can be used to identify and explore "various pockets, and 
packing defects for protein surfaces" (see page 287, final paragraph) but provides no useful
information about which "pocket" (if any) may be relevant for identifying allosteric modulators. 

The Office Action further takes the position that "the DOCK program also measures 
distances, making it obvious to measure cavities within 15 to 20 angstroms of a critical site." 
(Office Action at page 4.) Applicants disagree. The DOCK User's Guide and the Kuntz article 
provide NO guidance as to which of the multitude of pockets, cavities, packing defects, or other 
features of the protein surface should be explored for further evaluation as a site that may be 
useful for allosteric modulation of interactions at a separate and distinct functionally critical site. 
Applicants respectfully submit that it is error to equate the ability to measure with a teaching or 
suggestion to map the features of a cavity within a specified measurable distance. Applicants 
respectfully submit that there is nothing in the DOCK User's Guide, as evidenced by the Kuntz 
reference, that would teach or suggest to those of ordinary skill in the art that one seeking to 
identify sites for allosteric modulation should map a cavity that is within about 15 to 20 
angstroms of the functionally critical site on the target protein, as recited in the pending claims. 

Applicants fail to see how the addition of Takasaki does anything to overcome the 
deficiencies of the DOCK User's Guide and the Kuntz article. While the Examiner is correct 
that the reference identifies three critical binding sites on the molecule, the reference says 
nothing about allosteric modulation of binding at any of these sites. As noted in Applicants' 
prior responses, the authors of the Takasaki reference designed peptidomimetics having a similar 
shape to this critical site and an anti-TNFα monoclonal antibody, which would have the ability to
bind TNFα, and tested whether the peptidomimetic prevented TNFα binding and activation of the
receptor. Thus, while the Takasaki reference may describe the identification of critical binding 
sites within the TNF receptor, it says nothing about allosteric modulation of such a site. Thus, 
there is nothing in this reference that, when combined with the DOCK User's Guide, even as 
may be augmented by the Kuntz reference, would suggest the methods of the present invention. 

The Tang reference also does nothing to overcome the deficiencies of the other 
references. Tang is directed to methods for engineering ribosomal constructs that may be used as 
allosteric modulators of interactions between protein enzymes and their substrates. As amended 
herein, the instant claims are directed to certain types of protein-protein interactions that DO
NOT include interactions between protein enzymes and their substrates. Thus one of ordinary 
skill in the art would have had no reason to consult the Tang reference. 

Indeed, none of the references cited in the Office Action say anything at all about 
methods for developing allosteric modulators of intermolecular interaction between a target 
protein and a proteinaceous modifier, wherein said target protein is a membrane-bound protein, a 
cytosolic protein, a nuclear protein, a cytokine, a lymphokine, a chemokine, an adhesion 
molecule, a growth factor, or a receptor thereof. The Office has failed to identify any references 
that describe methods for identifying allosteric modulators of the sort of protein-protein 
interactions specified in the instant claims. 

Applicants submit, therefore, that no proper combination of the cited references in any
way establishes the prima facie obviousness of the claimed invention. In addition, Applicants 
note that there is further nothing in any of the cited art that would lead one of ordinary skill in the 
art to use nuclear magnetic resonance, crystal structure analysis, calorimetric values from 
thermodynamic studies or computer modeling (as recited in claim 25), in addition to distance 
from a functionally critical site, to determine which of the thousands of cavities that may be 
present within a given target protein should be identified as a cavity that is likely to induce an 
allosteric effect. Similarly, the Office has failed to identify any prior art that teaches or suggests 
the use of thermal P-factors, as recited in claim 31, to identify a cavity that is suitable for
exploration as a site for inducing allosteric modulation. Thus, the Office has once again failed 
completely to indicate how these additional claim elements are rendered obvious by the prior art. 
CONCLUSION

For these reasons, Applicants respectfully submit that the pending claims, as amended 
herein, are nonobvious over the references cited in the Office Action. Applicants request, 
therefore, that the allowability of claims 6-8, 25, 26, and 28-31 be acknowledged. Withdrawal of 
the pending rejections, rejoinder and favorable consideration of the withdrawn claims, and an 
early notice of allowance of all of pending claims 6 to 13, 16, 17, 19-22, 24 to 26 and 28 to 32 
are earnestly solicited. 



Date: June 30, 2009 /S. Maurice Valla/ 

S. Maurice Valla 
Registration No. 43,966 

Woodcock Washburn LLP 
Cira Centre 

2929 Arch Street, 12th Floor 
Philadelphia, PA 19104-2891 
Telephone: (215)568-3100 
Facsimile: (215) 568-3439 
We describe a method to explore geometrically feasible alignments of ligands and
receptors of known structure. Algorithms are presented that examine many 
binding geometries and evaluate them in terms of steric overlap. The procedure 
uses specific molecular conformations. A method is included for finding putative 
binding sites on a macromolecular surface. 

Results are reported for two systems: the heme-myoglobin interaction and the
binding of thyroid hormone analogs to prealbumin. In each case the program finds
structures within 1 Å of the X-ray results and also finds distinctly different
geometries that provide good steric fits. The approach seems well-suited for 
generating starting conformations for energy refinement programs and interactive 
computer graphics routines. 

1. Introduction 

To position two molecules so that they interact favorably with one another is a 
problem of general interest to chemists and biochemists. It is a problem of 
considerable difficulty for molecules of any complexity, because of the large 
number of internal degrees of freedom and the attendant local minima in the 
molecular conformation space. 

Our approach is to reduce the number of degrees of freedom using simplifying 
assumptions that still retain some correspondence to a situation of biochemical 
interest. Specifically, we treat the geometric (hard sphere) interactions of two rigid 
bodies, where one body (the "receptor") contains "pockets" or "grooves" that form 
binding sites for the second object, which we will call the "ligand". Our goal is to fix 
the six degrees of freedom (3 translations and 3 orientations) that determine the 
best relative positions of the two objects. 

Earlier studies of molecular docking (Wodak & Janin, 1978; Greer & Bush, 1978;
Levinthal et al., 1975; Salemme, 1976) made use of approximate potential functions
for the intermolecular interactions and grid searches to fix the degrees of freedom.
Here we use a very simple interaction function containing only two terms: hard
sphere repulsions and "hydrogen bonding". A zero value for this function will 
correspond to a docking geometry having: (1) no hard sphere overlaps between
receptor and ligand atoms; (2) all hydrogen bonding atoms of the ligand with a
nitrogen or oxygen atom of the receptor within 3.5 Å; and (3) all ligand atoms
within the receptor binding site.

We expect, and find, that more than one receptor-ligand geometry can provide
low values of the interaction function. Thus, we extend the goals of the calculation
to produce representatives of all geometrically feasible solutions.

The degrees of freedom are fixed using distance-comparison techniques (Lesk,
1979) derived from distance geometry methods (Crippen, 1981; Kuntz et al., 1979)
and optimization procedures. The method also uses molecular surface calculations
(Richards, 1977; Connolly, 1981).

We should emphasize that the assumption of rigid molecules is a severe 
restriction, since it requires prior knowledge of the two structures to atomic detail.
Hence, these procedures should not be construed as a general solution to the 
docking problem. The methods described may be extended to include some internal 
degrees of freedom at a later date. We do not view the calculated structures as ends 
in themselves; rather, we see them as reasonable starting points for molecular 
mechanics (Weiner & Kollman, 1981; Potenzone et al., 1977; Hagler et al., 1974;
Momany et al., 1974) and for interactive computer graphics studies (Langridge
et al., 1981).

We also note that some of the algorithms presented here have interesting 
independent applications such as: location and characterization of macro- 
molecular binding sites, conformational comparison among a series of compounds 
and design of new ligands. 

2. Methods 

(a) Approach 
We divide the problem into three parts. 

(1) Representation of the receptor and ligand structures, a process that includes 
identification of the possible binding sites on the receptor molecule. 

(2) Matching of the receptor and ligand representations. 

(3) Optimization of ligand position within the binding site. 
We outline the overall procedure and then describe the details. 

The representation program generates a set of spheres that fill all pockets and grooves on 
the surface of the receptor molecule. These spheres are collected into a number of 
presumptive "binding" sites: each site can then be examined independently for geometric 
matching with the ligand. The ligand molecule is also represented by a set of spheres that 
approximately fill the space occupied by the ligand. If the ligand provides a good match to 
the receptor site, the set of ligand spheres should, in some sense, fit within the set of receptor 
spheres. It is helpful to think of the "lock and key" analogy often used to describe enzyme- 
substrate interactions. The program produces a representation of the key (the ligand
spheres) and a representation of the key-hole (the receptor spheres). 

The pairing rule is based on a comparison of internal distances in both ligand and receptor.
A ligand sphere can be paired with a receptor sphere if each sphere belongs to a set of spheres 
with the following property: the internal distances of all the spheres in the ligand set must 
match all the internal distances within the receptor set, within some error limit on each 
distance. This rule allows the identification of geometrically similar clusters of spheres in the 
receptor site and in the ligand, without requiring explicit rotations of one structure onto the
other.
The final stage in the program explores the suggested pairings. It carries out the rotation
of the ligand spheres onto the corresponding receptor spheres using the least-squares
algorithm of Ferro & Hermans (1977). The rotation/translation matrix generated from the
spheres is then applied to all ligand atoms. The optimization procedure manipulates the
ligand atom co-ordinates to reduce atom overlaps and ensure hydrogen-bonding partners. It
also makes an effort to place ligand atoms within receptor spheres. The output of this part of
the program is a set of ligand structures expressed in the receptor molecule co-ordinate
system. Each structure has a score based on the degree of overlap. Geometrically similar
structures are grouped together for display on a computer graphics system or for further
processing. The entire procedure takes hours of computer time on a minicomputer such as a
PDP 11/70 (Digital Equipment Corporation).



(b) Detailed algorithms 
(i) Representation

The guiding principle is to represent the ligand and the receptor site so that the two 
representations are identical in the limit of a "perfect" geometric fit. The usual atomic 
description of molecules can never achieve this goal since the ligand and receptor atoms are 
never coincident in their optimum geometry. The definition of molecular surface by
Richards (1977) and a program to calculate this surface developed by Connolly (1981, 1982)
provide a useful starting point. In the limit of perfect fit, the ligand and receptor surfaces will 
be in exact correspondence within the receptor binding site. 

A brief description of the molecular surface is required. The surface, as implemented by 
Connolly's program, is a collection of points and the vectors normal to the surface at each 
point. The points fall into two classes, following Richards: contact points that lie on the van 
der Waals' surface of the solvent-accessible atoms and re-entrant points that lie on the 
inward-facing surface of a "probe" sphere. As Richards has emphasized, the surface and 
volume of a collection of atoms can only be defined completely with reference to a probe 
object of some form. A spherical probe is most commonly used. A probe of zero radius yields 
the van der Waals' surface of a molecule, whose volume is the sum of the atomic volumes and 
whose appearance is very similar to the normal "space-filling" atomic models in common 
use. A probe of infinite radius generates a solid polyhedron called the "convex hull" of the 
collection of atoms. It is the smallest object with no concavities that contains all the atoms. 
For the work presented here, we used a probe sphere of radius 1.4 Å to approximate a water
molecule. The results are not sensitive to the precise probe radius if the probe is small 
compared to the ligand molecule. 

The surface calculation also requires specification of all atomic radii (Table 1). Since 
hydrogen co-ordinates are rarely available from X-ray diffraction experiments on 
macromolecules, we use "united atom" radii that are somewhat larger than the usual van 
der Waals' radii. A straightforward refinement would use explicit hydrogen atoms and 
smaller heavy-atom radii. Such a refinement would not alter the major results of this paper. 

In principle, docking algorithms could be designed that use the molecular surfaces
directly. The large number of points per surface causes several difficulties, so we developed a
more compact representation. 

For a surface composed of n surface points, we construct a set of spheres with the following 
properties. 

(1) Each sphere touches the molecular surface at two points (i, j) and has its center on the
surface normal from point i (Fig. 1).
(2) Each receptor sphere lies on the outside of the receptor surface.
(3) Each ligand sphere lies on the inside of the ligand surface.

The co-ordinates of the centers of the spheres are found analytically. Special cases 
involving surface normals lying along the principal axes or symmetry-related points must be 
treated separately. 
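
The analytic construction can be illustrated with a short sketch. It assumes, which the text does not state explicitly, that each surface normal is a unit vector pointing toward the side on which the sphere is built (outward for receptor spheres, inward for ligand spheres). Under that assumption a sphere with center c = p_i + r n_i that also passes through p_j has radius r = |p_j - p_i|^2 / (2 n_i . (p_j - p_i)). The function names are illustrative only, and the special cases mentioned above are not handled.

```python
import math

def tangent_sphere(p_i, n_i, p_j):
    """Sphere through surface points p_i and p_j with its center on the normal at p_i.

    n_i is ASSUMED to be a unit vector pointing toward the side on which the
    sphere is built. Returns (center, radius), or None when p_j lies on or
    behind the tangent plane at p_i, in which case no such sphere exists.
    """
    d = [b - a for a, b in zip(p_i, p_j)]
    n_dot_d = sum(n * x for n, x in zip(n_i, d))
    if n_dot_d <= 0.0:
        return None
    radius = sum(x * x for x in d) / (2.0 * n_dot_d)
    center = tuple(a + radius * n for a, n in zip(p_i, n_i))
    return center, radius

def smallest_sphere(i, points, normals):
    """Of the candidate spheres at surface point i, keep the one of smallest radius."""
    best = None
    for j, p_j in enumerate(points):
        if j == i:
            continue
        sphere = tangent_sphere(points[i], normals[i], p_j)
        if sphere is not None and (best is None or sphere[1] < best[1]):
            best = sphere
    return best

def subtended_angle(center, p_i, p_j):
    """Angle in degrees formed at the sphere center by surface points p_i and p_j."""
    u = [a - c for a, c in zip(p_i, center)]
    v = [a - c for a, c in zip(p_j, center)]
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
```

Keeping only the smallest sphere at each point and then filtering on the subtended angle corresponds to the first two reduction steps described in the text that follows.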
Table 1

Atomic radii and overlap distances

A. van der Waals' radii (Å) (united atoms)

Surface calculations Refinement calculations

B. Overlap distances (Å)

      N     O     C     S     I     Br
N    2.5   2.5   3.5   3.5   3.0   2.8
O          2.5   3.5   3.5   3.0   2.8
C                3.5   3.5   4.0   3.8
S                      3.5   4.0   3.8
I                            4.5   4.35
Br                                 4.25




Fig. 1. Sphere generated tangent to surface points with center on surface normal at point i. 



We reduce the number of spheres in several ways. First, while n - 1 spheres can be
constructed at each surface point, we are only concerned with the sphere of smallest radius, 
since all larger spheres at that point must intersect the molecular surface. Second, for each 
smallest sphere, we calculate the angle formed by the 2 surface points, i and j, and the sphere 
center (Fig. 1). We retain spheres with angles less than 90° since such spheres are more likely 
to span across an invagination than spheres with large angles, which tend to lie in shallow 
grooves. The exact choice of angular cut-off is not crucial. Two additional restrictions are 
applied to the receptor only. Receptor spheres are accepted only if points i and j belong to
atoms in amino acid residues that are more than 4 apart in sequence. This eliminates the
grooves formed by every alpha helix. We also eliminate receptor spheres with radii greater
than 5 Å. Inspection showed that such spheres extend out of the "top" of the binding
pockets. Finally, we reduced the data storage and computation requirement by retaining 
only one sphere per atom. Specifically, we choose the largest sphere formed from the contact 
surface points of each receptor atom and the largest sphere formed from the re-entrant surface 
points of each ligand atom. While more spheres could be retained, this set appears sufficient 
for our purposes. 




The result of these manipulations is a set of spheres for each molecule of interest (Fig. 2).
We finish with no more than one sphere per atom. Each is characterized by its center
co-ordinates, radius and internal angle (Fig. 1).

At this point, the list of receptor spheres includes all the invaginations of the receptor
surface. We separate the list into several possible binding sites with the rule: spheres belong
to the same site if the spheres overlap. This procedure identifies a small number of sites
scattered about the surface of a typical globular protein (Figs 3 and 4). In the proteins we
have examined, the binding site is always the largest such feature. This approach should be
useful in characterizing the binding sites of any macromolecule whose structure is available.
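
The grouping rule can be sketched as a connected-components problem. The sketch below assumes that two spheres "overlap" when the distance between their centers is less than the sum of their radii, and that membership is transitive (a chain of overlapping spheres forms one site); neither detail is spelled out in the text. The union-find bookkeeping and the function name are illustrative choices, not the authors' implementation.

```python
import math

def group_into_sites(spheres):
    """Group spheres into candidate sites; spheres is a list of (center, radius).

    Returns a list of sites, each a list of sphere indices, largest site first.
    Overlap is ASSUMED to mean center distance < sum of radii.
    """
    parent = list(range(len(spheres)))

    def find(a):
        # Follow parent links to the root, shortening the path as we go.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a in range(len(spheres)):
        for b in range(a + 1, len(spheres)):
            (center_a, radius_a), (center_b, radius_b) = spheres[a], spheres[b]
            if math.dist(center_a, center_b) < radius_a + radius_b:
                parent[find(a)] = find(b)

    sites = {}
    for a in range(len(spheres)):
        sites.setdefault(find(a), []).append(a)
    return sorted(sites.values(), key=len, reverse=True)
```

Sorting by size puts the largest cluster first, which matches the observation above that the binding site was the largest such feature in the proteins examined.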

The entire computation for generating these spherical representations is relatively rapid: a
few minutes of PDP 11/70 time for ligands of 40 to 50 atoms and approx. 30 min for small
globular proteins. The surface calculation, which must precede these efforts, takes 5 s/atom
on the 11/70.

Fig. 2. Schematic representation of a small binding site formed from five atoms (thin circles). The
molecular surface is shown as the thick line. The two receptor spheres (thick circles) are constructed as
described in the text, with their centers lying along the surface normals (arrows).

Fig. 3. Cross-section of the molecular surfaces for myoglobin (green) and heme (red). Part of the
α-carbon chain is also shown. The most prominent feature is the E helix.

Fig. 4. Major pockets or invaginations in the myoglobin surface. The heme pocket (green) and 2
regions formed by helical surfaces (violet, blue) are discussed in the text.

(ii) Matching 

Given the set of spheres for a particular ligand and the set of spheres for one of the binding 
sites of the receptor, we must next find some method of alignment of the two representations. 
The major hurdle is that one cannot rely on having information that pairs specific ligand and 
receptor spheres. The two general approaches to the problem are grid searches or 
combinatorial matching. 

One cannot try all possible combinations. For m ligand spheres and n receptor spheres
there are n!/(n - m)! possibilities. For n = 50 and m = 25 there are choices of the order of
10^39. While the vast majority of such arrangements would be geometrically impossible, the
number is too large for even the simplest test strategy. There is a large redundancy here,
since fixing 4 specific pairs is sufficient to determine the rigid docking. Even so, for moderate-
sized ligands the number of possibilities is large. We feel it is desirable to retain as thorough a
search as is computationally feasible. Our procedure is as follows: systematically pair each
ligand sphere, i, with each receptor sphere, k. Consider the set of distances from i to all other
ligand spheres j, d_ij, and from k to all other receptor spheres l, d_kl. Assign a second pair (j, l)
of spheres so that a maximum number of spheres obey the condition:

abs(d_ij - d_kl) < ε,

where ε is a parameter that specifies the allowed deviation between the ligand and receptor
internal distances. We find that a value for ε between 1 and 2 Å works well for the molecules
in this study.
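
The size of the exhaustive search quoted above is easy to check. The helper name below is hypothetical; `math.perm` is the Python standard-library function for n!/(n - m)!.

```python
import math

def ordered_pairings(n_receptor, m_ligand):
    """Number of ways to assign m ligand spheres to distinct receptor spheres."""
    return math.perm(n_receptor, m_ligand)  # n! / (n - m)!

count = ordered_pairings(50, 25)
print(f"{count:.2e}")  # of the order of 10^39
```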

Once the best second pair of ligand-receptor spheres has been identified, proceed to a third 
pair of spheres subject to the additional constraint that the distances from the new spheres 
to the previously assigned pairs must also obey the error check. This process continues until 
no further pairs can be assigned. If the number of assignable pairs is less than 4, the
orientation problem is underdetermined and the match is rejected. Otherwise, the match is
retained for the refinement procedure. Lesk (1979) has described a somewhat similar
approach that also makes use of internal distance comparisons.
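
A simplified sketch of this pair-growing step is given below. It is not the authors' code: where the text selects the next pair so that a maximum number of spheres satisfy the distance condition, this sketch substitutes a simpler greedy choice (the candidate with the smallest total distance deviation from the pairs already assigned). The function name, the seed argument and the default ε of 1.5 Å are assumptions made for illustration.

```python
import math
from itertools import product

def grow_match(ligand, receptor, seed, eps=1.5):
    """Grow a list of (ligand index, receptor index) pairs from one seed pair.

    A candidate pair (j, l) is admissible only if, for every pair (i, k)
    already assigned, abs(d_ij - d_kl) < eps. ASSUMPTION: among admissible
    candidates, the one with the smallest summed deviation is taken.
    Returns None when fewer than 4 pairs can be assigned.
    """
    pairs = [seed]
    used_ligand, used_receptor = {seed[0]}, {seed[1]}
    while True:
        best = None
        for j, l in product(range(len(ligand)), range(len(receptor))):
            if j in used_ligand or l in used_receptor:
                continue
            errors = [
                abs(math.dist(ligand[i], ligand[j]) - math.dist(receptor[k], receptor[l]))
                for i, k in pairs
            ]
            if max(errors) < eps and (best is None or sum(errors) < best[0]):
                best = (sum(errors), j, l)
        if best is None:
            break
        _, j, l = best
        pairs.append((j, l))
        used_ligand.add(j)
        used_receptor.add(l)
    return pairs if len(pairs) >= 4 else None
```

Calling `grow_match` once for every possible seed pair (i, k) would reproduce the systematic outer loop described above; each surviving list would then go on to the refinement stage.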

Some caveats should be clearly stated. The matching procedure outlined here does not 
guarantee a globally optimum solution. The process can require of the order of n* steps to
run to completion. For the work here we have reduced the computation time by only 
accepting ligand and receptor spheres whose centers are approximately the same distance 
from the centroid of each respective representation. The allowed deviations were typically 1
to 2 Å. This restriction is equivalent to the assumption that the ligand fills the receptor
pocket but does not extend much outside it. This turns out to be a good approximation for 
the systems studied in this paper but could not be used for a small ligand in a large pocket or 
a large ligand that extends far outside the receptor site. 

With the assumptions and parameters given above, the matching algorithm ran to 
completion in a few hours of computer time†. Typically, several hundred assignment lists
were generated that met the criteria listed above. 

The output of this stage of the program consists of short lists of pairs of ligand and
receptor sphere numbers. These collections of spheres have been generated to have all
internal distances matched within ε. The lists require further processing for 3 reasons.

(1) Internal distances retain an ambiguity in handedness that must be resolved. 

(2) The unmatched ligand spheres (those not on a particular list) may be forced to lie in 
totally unacceptable locations to achieve the geometric requirements of the list 
assignments. 

(3) We need to establish a co-ordinate transformation to locate the ligand with respect to 
the receptor. 

These questions are resolved in the final stage of the program. 

(iii) Optimization routines

The first task is to convert the list assignments into Cartesian co-ordinates. Then the 
ligand is moved somewhat to improve its "fit" according to the criteria mentioned earlier: 
minimization of overlap, hydrogen-bond pairing, and positioning of the ligand into the
receptor binding site. 

The conversion into co-ordinates is straightforward. The ligand spheres of each list are 
translated and rotated onto their receptor sphere partners using the least-squares algorithm 
ORIENT of Ferro & Hermans (1977), which has proved robust and efficient for our 
purposes. The ORIENT routine returns a measure of how closely the 2 lists of spheres 
correspond to each other. If the correspondence is poor (that is, a least-squares error greater
than 3 Å) the program reverses the handedness of the set of ligand spheres and ORIENT is
called again. If this error is also large, the particular assignment set is rejected and the next
one is tried. 

If the least-squares error from ORIENT is acceptable, the translation/rotation matrix 
from the procedure is applied to all ligand atoms. This process yields Cartesian co-ordinates 
for the ligand atoms referenced to the centroid of the receptor sphere cluster. 

The next step is to optimize the placement of the ligand. There are many options one could 
explore, including energy minimization and interactive computer graphics manipulation. 
We devised a few simple procedures to sort rapidly among the large number of possible 
structures generated in the matching routine. Our goal was not a full-scale optimization but 
rather a weeding out of implausible structures. 

We first compute an overlap error between all ligand atoms and all receptor atoms,

E_overlap = Σ (r_i + r_k - d_ik),    (1)

where r_i, r_k are the van der Waals' radii of the ligand and receptor atoms (Table 1) and d_ik is
the interatomic distance. The sum is only taken for positive values. The overlap error is
roughly proportional to a van der Waals' repulsion with a proportionality constant of 0.1 to
convert into kcal/mol for non-bonded methyl groups.

† The equivalent grid search must scan 3 orientation angles with approx. 100 steps per angle and 3
translational motions that can be restricted, by the centroid assumption, to approx. 10 steps per axis.
This grid contains 10^9 points, each point requiring calculations of approx. 10^3 distances to check for
overlaps, too large a calculation for the minicomputer.
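
The overlap error of equation (1) can be written as a short function. This is a minimal sketch under stated assumptions: atoms are represented as (position, radius) tuples, the radii are supplied by the caller, and the function name is hypothetical. Only pairs with a positive penetration contribute, as the text specifies.

```python
import math

def overlap_error(ligand_atoms, receptor_atoms):
    """Sum of (r_i + r_k - d_ik) over ligand/receptor atom pairs that overlap.

    Each atom is a (position, radius) tuple; this layout is an assumption
    made for the sketch, with positions and radii in the same length unit.
    """
    total = 0.0
    for ligand_pos, ligand_radius in ligand_atoms:
        for receptor_pos, receptor_radius in receptor_atoms:
            penetration = ligand_radius + receptor_radius - math.dist(ligand_pos, receptor_pos)
            if penetration > 0.0:  # the sum is taken only for positive values
                total += penetration
    return total
```

A structure with no hard-sphere contacts scores zero, so lower values indicate a better steric fit.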

Next, we attempt to improve E_overlap by moving the ligand further into the receptor site as
follows. The receptor sphere closest to each ligand atom is found. If the sphere is within 5 Å,
we use the receptor sphere center as a target for the ligand atom. This is repeated for all 
ligand atoms. Then the ORIENT routine is called and a new set of ligand co-ordinates is 
calculated by rotating the ligand atoms onto their target positions. If the overlap error is 
reduced, the process continues. 

Finally, we use a displacement procedure that further reduces the overlap error and 
establishes potential hydrogen-bonding patterns. A displacement is calculated for any ligand 
atom that violates the overlap constraint or for any polar ligand atom that is further than 
3.5 Å from a receptor polar atom. The displacement is along the line of atom centers for the 2
atoms involved in each violation. The magnitude of the displacement is just sufficient to 
remove the error. If these displacements were applied to the atoms directly, or as pseudo 
forces, they would result in distortion of the rigid ligand-molecule geometry. To avoid this 
problem, the displacements are treated as targets and the ORIENT routine is used to find 
the best rigid-body translation/rotation onto the desired locations. The process is reasonably 
convergent. It terminates inelegantly if the number of displacements becomes less than 4. 
An alternative method that uses functional optimization for rigid-body displacements has 
been described by Cox (1967). 
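
The displacement for an overlap violation can be sketched as below. The sketch covers only the overlap case (not the hydrogen-bonding case), and the function name and argument layout are hypothetical. It returns the position the ligand atom would need to reach for the overlap to vanish; in the procedure described above such positions serve only as targets for a rigid-body fit and are never applied to atoms individually.

```python
import math

def overlap_displacement_target(lig_xyz, lig_radius, rec_xyz, rec_radius):
    """Return the target position that just removes a van der Waals overlap.

    The ligand atom is moved along the line of centers, away from the
    receptor atom, by exactly the overlap distance. Returns None when
    there is no violation.
    """
    d = math.dist(lig_xyz, rec_xyz)
    overlap = lig_radius + rec_radius - d
    if overlap <= 0.0 or d == 0.0:
        # No overlap, or coincident centers (direction undefined).
        return None
    scale = overlap / d
    return tuple(l + (l - r) * scale for l, r in zip(lig_xyz, rec_xyz))
```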

The result of the refinement procedure is a set of ligand co-ordinates that have been 
adjusted to a locally good fit to the restraints. In keeping with the general purpose of this 
study, no effort has been made to carry the process to complete convergence since we 
anticipate further refinement using molecular mechanics. Approximately 100 structures are 
produced in the two test cases described below. Each is scored using the overlap error. We 
also group the structures into classes whose members have atom-to-atom r.m.s.†
displacements of less than 1 Å. This classification greatly aids inspection of the various
docking arrangements. The refinement and classification programs require approximately 
5 min per structure on the PDP 11/70. 
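
One way to picture the classification step is a single-linkage grouping, sketched below. It assumes that a structure joins a class when it lies within the r.m.s. cutoff of at least one existing member, which is how the table footnotes describe the classes; the function names are hypothetical, and structures are lists of (x, y, z) tuples with atoms in the same order.

```python
import math

def rms_displacement(a, b):
    """Atom-to-atom r.m.s. displacement between two equally ordered structures."""
    sq_sum = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(sq_sum / len(a))

def classify(structures, cutoff=1.0):
    """Assign a class label (0, 1, ...) to each structure by single linkage."""
    labels = [None] * len(structures)
    next_label = 0
    for i in range(len(structures)):
        if labels[i] is not None:
            continue
        labels[i] = next_label
        stack = [i]
        while stack:  # grow the class through chains of close neighbours
            j = stack.pop()
            for k in range(len(structures)):
                if labels[k] is None and rms_displacement(structures[j], structures[k]) < cutoff:
                    labels[k] = next_label
                    stack.append(k)
        next_label += 1
    return labels
```

Note that under single linkage two members of the same class may differ by more than the cutoff, provided a chain of intermediate structures connects them.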

3. Results 

Our primary purpose in this paper is to test the techniques described above. 
Specifically, we ask: does the program reproduce known ligand-receptor 
geometries? If so, does it also provide alternative structures that are geometrically
reasonable? To these ends, we have examined two systems for which the ligand-receptor
geometry has been established by crystallographic means. The first is the
heme-myoglobin interaction in metmyoglobin. The second is the binding of
thyroxine to prealbumin. We also discuss some preliminary results for the docking 
of modified thyroxines in prealbumin for which no direct structural data are 
currently available. 

(a) Myoglobin-heme binding 
We select this system for several reasons. The crystallographic results are 
unambiguous so that it affords an excellent control calculation. The heme group 
has very little internal flexibility. The high symmetry and the "flatness" of the 
heme provide a demanding test of the representation and matching parts of the 
algorithms. 

† Abbreviation used: r.m.s., root-mean-square.
We review the complete procedure and provide quantitative details. Two
surfaces are calculated at a density of five points/Å². The heme surface is obtained
from the heme co-ordinates taken directly from the myoglobin structure (Takano,
1977). The receptor surface is simply that of the myoglobin with the heme atoms
removed. No internal water molecules are included. The ligand and receptor 
surfaces are closely complementary in the heme pocket (Fig. 3), as one expects for a 
tight complex. These surfaces are the input data for the representation program. 
The heme group contains 43 non-hydrogen atoms, for which 408 re-entrant 
surface points were used to construct 42 spheres meeting the various constraints 
above. (The iron atom has no re-entrant surface when the heme surface is 
constructed using a 1.4 Å probe.) Three minutes of computer time were required.
The radii of the 42 spheres ranged from 1.75 to 2.15 Å, slightly larger, on average,
than the van der Waals' radii of the various atoms with which they were associated. 
The sphere centers are displaced "inward" of the atom centers so that the ligand 
surface is well-filled (Fig. 5). The visual correspondence between heme surface and 
the envelope of the spheres is quite good. We should note that a perfect 
correspondence is not essential for the subsequent stages of the program. The 
important point is to have a sufficiently accurate set of internal distances so that 
the matching algorithm will function properly†. The fit between the ligand surface
and the surface of the spherical representation can be improved to almost any 
desired limit by increasing the number of surface points in the original surface and 
by increasing the number of spheres retained in the representation. The choice of 
one sphere/atom was for convenience, and proved to be of sufficient accuracy for 
this study.

Metmyoglobin, without the heme, contains 1217 non-hydrogen atoms. We used 
2289 contact surface points to construct 204 spheres. The small ratio of 
spheres/atoms reflects the lack of solvent accessibility of many myoglobin atoms 
and the more restrictive conditions for receptor sphere selection. This calculation 
took 40 minutes of 11/70 time. 

The metmyoglobin spheres fall into three major clusters of 54, 23 and 17 spheres 
and a number of smaller clusters (Fig. 4 and Table 2). The largest cluster is clearly 
the heme pocket. It is the only one large enough to accommodate a ligand of that 
size. The spherical radii range from 1.5 to 4.8 Å, with the larger values associated
with the top and "front" (E-F) face of the pocket. Closer inspection confirms that 
the spherical representation provides a good approximation to the pocket (Fig. 6). 
The only features that do not agree well are two substantial extensions along the 
outer surface that come from the large spheres for the side-chain oxygens of Gln91 
and Ser92. These could be removed without altering the results below. 

The second largest cluster is formed by the N terminal of the protein, the EF
corner, and the side of the H helix. The third largest is a shallow invagination on
the GH corner that continues some distance along the B helix/G helix interface
(Fig. 4). Some of the smaller "sites" will be discussed in more detail in a later paper.

Returning to our main theme, the identification of the largest site with the heme 

† The more conventional choice of using interatomic distances for the ligand has the advantage of
greater accuracy and numerical stability, but the interatomic distances for the receptor are not a 
suitable match set. 
Fig. 5. Front (a) and side (b) views of the heme surface (red) filled with spheres (blue) calculated using
eqn (1).



pocket allowed us to use the matching algorithm with just the spheres for that site 
and those for the heme. The distance matching parameter was 1.5 Å. There are
54 x 43 or 2322 choices for the initial pairing. Of these, 573 were retained after
screening the distances from the ligand and receptor centroids. All of these starting
points provided at least six ligand-receptor pairs. The median number of entries
was eight per list, with about 25% of the arrangements having nine or ten entries.
Remember that eight entries imply that all 8 x 7/2 or 28 internal distances among
the eight ligand spheres and the 28 distances for the corresponding eight receptor
spheres agree within the 1.5 Å limit.
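
The consistency condition on an assignment list can be sketched as follows. This is only a check of a finished list, not the search that builds the lists; the function names and the list-of-index-pairs representation are hypothetical, and sphere centers are (x, y, z) tuples.

```python
import math
from itertools import combinations

def n_internal_distances(n):
    """Number of distinct internal distances among n spheres: n(n-1)/2."""
    return n * (n - 1) // 2

def is_consistent(pairs, ligand_centers, receptor_centers, tol=1.5):
    """Check that every ligand internal distance matches its receptor counterpart.

    pairs is a list of (ligand_index, receptor_index) assignments. Each
    ligand-ligand distance must agree with the corresponding
    receptor-receptor distance to within tol angstroms.
    """
    for (l1, r1), (l2, r2) in combinations(pairs, 2):
        d_lig = math.dist(ligand_centers[l1], ligand_centers[l2])
        d_rec = math.dist(receptor_centers[r1], receptor_centers[r2])
        if abs(d_lig - d_rec) > tol:
            return False
    return True
```

For a list with eight entries, `n_internal_distances(8)` gives the 28 distances per molecule quoted above.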

The matching algorithm required about 30 seconds/structure for the heme 
Table 2
Myoglobin surface sites

Site  Spheres                              Atoms

1     54       B, C, CD, E, F, FG, G, H    Leu32 CD2; Glu41 O; Lys42 O CG NE;
                                           Phe43 CD1 CE1 CE2 CZ; Asp44 N;
                                           Asp44 OD1; Arg45 CD NH1 NH2;
                                           Lys63 CE; His64 CE1; Thr67 OG1 CG2;
                                           Val68 O CG1 CG2; Ala71 CB;
                                           Leu72 CA CD1; Leu89 CG CD1;
                                           Gln91 OE1; Ser92 OG; His93 CE1 NE2;
                                           Thr95 CG2; Lys96 O NZ;
                                           His97 CB CG ND1 CE1 NE2;
                                           Ile99 CG1 CG2 CD1;
                                           Tyr103 O CD1 OH; Leu104 CD2;
                                           Ile107 CB CG2 CD1; Ser108 OG;
                                           Ile1U CD1; Phe138 CE1 CZ;
                                           Leu149 CG CD2

2     23       NA, S, EF, F, H             Val1 N CG2; Leu2 O; Glu4 OE1;
                                           Lys79 O; Gly80 O; His81 CE1 NE2;
                                           Glu83 CA O CB OE2; Lys87 CE NZ;
                                           Gln91 NE2; Leu137 CD2; Asp141 OD2;
                                           Lys145 CG CD CE NZ; Glu148 CD OE1

3     17       B, C, G, H                  Arg31 NE NH1; His36 CE1 NE2;
                                           Glu109 C CB CG OE1; Ala110 N;
                                           His116 CG CD2 CE1 NE2; Gly124 CA;
                                           Ala125 N; Gln128 OE1 NE2


problem. We found no way of deciding by simple inspection which of the 
assignment lists would yield geometrically reasonable structures. For example, the 
longest lists generally led to excessive overlap (see below). Thus we decided to 
retain all the assignment lists with six or more entries for processing by the 
refinement section of the program. 

Refinement 

Two parameters were evaluated for the matching lists. 
(1) The overlap error (eqn (1)). 

(2) The root-mean-square deviation from the heme co-ordinates in the original
X-ray structure. 

Of course, the deviations from the crystal structure would not be available for an 
"unknown" but were calculated here to test the docking routines. The refinement 
program required approximately four minutes per structure for the myoglobin- 
heme system. The initial orientation phase reduced the overlap errors by
approximately fourfold; the displacement phase of the routine made a roughly
equivalent improvement. In Table 3 some of the final structures are listed in order
of increasing overlap error. The structures fall into classes, the class with the
smallest overlap error being the one closest to the original co-ordinates. Four
structures were found that were within 1 Å of the X-ray heme co-ordinates
(class 1). All of these had small overlap errors (typically 0.2 Å) for the van der
Fig. 6. Front (a) and side (b) views of the spherical representation of the heme pocket (green) and the
heme surface (orange) (see the text).

Waals' radii we used. No attempt was made to refine these structures beyond the
displacements described above. The overlap function (eqn (1)) had values of 16 to
25 for these four structures compared with 20 for the X-ray co-ordinates. As noted
earlier, these overlap values are equivalent to approximately 2 kcal/mol repulsive
contribution. At somewhat higher values of the overlap function (~30) there is an
interesting group of docking geometries (class 2), in which the heme is rotated 180
degrees about an axis passing through CHA and CHC so that the propionic acid
Table 3

Myoglobin-heme dockings arranged in order of overlap function

Overlap†   r.m.s.‡   Class§

16.8       0.50      1
21.6       0.80      1
23.8       0.80      1
24.3       1.02      1
27.5       6.57      2
27.5       6.56      2
29.9       6.54      2
29.9       6.54      2
30.0       6.54      2
30.9       6.54      2
32.5       6.72      2
34.3       1.36      1
34.9       1.37      1
36.3       5.35      3
37.0       5.83      3
39.0       4.67      3
39.2       4.62      3
39.2       4.67      3
40.0       6.27
40.0       4.71      3
41.4       5.38      3
41.5       4.84      3
41.9       4.38      3
43.0       5.34      3
46.2       7.01
48.3       4.77      3
51.9       1.65
53.1       5.15      3
61.9       6.89      4
63.4       6.33
64.0       6.96      4
65.2       6.54
65.8       6.97      4
67.2       7.47
67.9       6.90      4
70.4       6.97      4
76.1       5.37      3
80.7       6.99      4

† As defined for eqn (1). The X-ray structure had an overlap value of 20. These values can be
converted into conventional units of kcal/mol by multiplying by 0.1 (see the text).
‡ r.m.s. co-ordinate error compared with heme co-ordinates from X-ray data.
§ All structures in a class are within 1 Å r.m.s. co-ordinate error of another class member. Class 1,
X-ray; class 2, inverted; classes 3 and 4, 90° rotation (see the text).

side-chain positions are interchanged. These structures are really quite similar to 
the X-ray structure, although there are more close contacts between ligand and the 
heme pocket. There have been reports of such heme "inversion" in some insect
hemoglobins (La Mar et al., 1981). The large r.m.s. co-ordinate errors arise from the 
large displacements of particular atoms during the heme rotation. Structurally, 
they are replaced with nearly identical atoms so that the net change in shape is 
quite small. At still larger values of the overlap parameter (E_overlap > 35) are
structures that correspond to various 90 degree rotations of the heme, with or
without inversion (classes 3 and 4). These arise because of the high symmetry of the
heme core but are of much less biochemical interest because of the awkward
placement of the various heme side-chains. The large overlap values mark these
structures as poorer docking candidates.

In summary, the myoglobin-heme calculation confirms that the algorithms
produce a sampling of the reasonable geometries for this system. Some of the
structural classes are quite far apart in conformation space. While we cannot claim
that the program has produced all structural classes of interest, it has certainly
found the most obvious geometric variants, given the basic symmetry of the heme 
group. Of course, we make no claim that this approach could start with a proper 
apomyoglobin structure and proceed to the (presumably) significantly different 
myoglobin geometry. 

(b) Thyroxine-prealbumin 

The human serum thyroxine-transport protein prealbumin was the first 
hormone-binding protein to be characterized fully. Its atomic structure has been 
determined, and extensively refined, at 1.8 Å resolution by X-ray analysis (Blake
et al., 1978; S. J. Oatley, unpublished results). Prealbumin is a tetramer of identical 
subunits, each containing 127 residues; the subunits associate in an ellipsoidal 
shape forming a long central channel containing two hormone binding sites (Blake 
& Oatley, 1977). The symmetry of the molecule requires not only that the two sites 
are identical but also that each site itself has twofold symmetry. 

The binding of thyroxine to prealbumin has also been investigated at 1.8 Å
resolution (Blake et al., 1982; Blake, 1981). The interpretation and refinement of
these data are still in progress (S. J. Oatley, J. M. Burridge & C. C. F. Blake,
unpublished results). The initial difference electron-density map was dominated by
the features corresponding to the electron-dense iodine atoms, although extensive 
small conformational changes in the protein were also evident. A preliminary 
interpretation of this map was obtained by adjusting the torsion angles of a 
thyroxine molecule (Cody, 1974) so that its iodine atoms best fitted their 
corresponding electron-density features (Blake et al., 1981). 

Thyroxine is oriented with its phenolic hydroxyl group buried deep within the 
binding channel and its carboxyl and amino groups ion-paired with the Lys15 and
Glu54 residues at the mouth of the binding channel. The positions of these charged
residues in the native protein structure are such as to be in apparent close contact
with thyroxine: they must be displaced on hormone binding, but their new
positions were not clearly resolved in the electron-density maps. Accordingly, these
close contacts were relieved by energy refinement of the complex (Blaney et al.,
1982b), allowing only the side-chains of Lys15 and Glu54 and the amino acid
moiety of thyroxine to move from their X-ray co-ordinates. This provided the 
description of the binding site used as the starting point in our calculations. 

Thyroxine (T4) has 24 non-hydrogen atoms, each of which yields a sphere in 
the ligand representation. The prealbumin "site" we used contained only one of the 
two binding sites and required 43 spheres for its representation. As was seen for 
myoglobin, the spherical representations, even at this level of approximation, are
quite similar to the surfaces from which they are derived. The matching program
examined 360 potential docking arrangements. 

The refinement program provided three major groupings of structures (Table 4). 
The set with the least overlap was, curiously enough, an inverted form in which the 
amino acid moiety was innermost in the cavity and the phenolic ring was directed 
outward toward the solvent (class 1). Although this arrangement had been 
considered at an early stage of the crystallographic investigation (Blake & Oatley, 
1977), the present findings and the results of energy calculations indicate that this 

Table 4

Thyroxine-prealbumin structures

Overlap†   r.m.s.‡   Class§

0.0        8.89      1
1.4        7.16      1
1.4        8.19      1
2.7        9.19      1
3.6        6.71      1
5.5        8.32      1
9.1        7.55      1
11.7       9.26      1
13.8       7.29      1
16.7       7.05      1
19.3       5.84      1
22.7       0.51      2
22.7       9.40      1
23.2       0.36      2
23.6       0.37      2
23.7       0.62      2
23.8       0.37      2
24.4       0.61      2
25.0       0.70      2
25.2       8.22      1
26.6       9.24      1
28.5       0.45      2
30.3       0.97      2
30.5       9.12      1
30.9       1.59      3
31.1       1.45      2
31.1       1.61      3
31.1       7.16      1
31.3       0.94      2
31.3       1.60      3
31.4       1.60      3
31.5       1.60      3
32.2       1.58      3
32.2       1.67      2
32.6       1.58      2

† As defined for eqn (1). The X-ray structure has an overlap value of 40 (see footnotes to Table 3 and
text).
‡ r.m.s. co-ordinate error compared with thyroxine co-ordinates from X-ray data.
§ All structures in a class are within 1 Å r.m.s. co-ordinate error of another class member. Class 1,
inside-out; class 2, X-ray; class 3, C-2 (see the text).
is not a favored conformation. The simple hydrogen-bond procedure we used did
not rule out this class of structures because of the central Thr hydroxyl groups; a
more realistic test that forced ion-pairing would have done so. The next group of
structures, ranked by overlap, are quite close to the conformation we chose for a
reference point (class 2). The overlap values range from 23 to 45 (not shown) with
the reference structure having a value of 40, and are equivalent to 2 to 5 kcal/mol
repulsive energy. The overlaps are primarily with the I-3 and I-5 iodines. These
structures show a small (0.3 Å) displacement inward compared to the starting
structure, possibly because we omitted an internal water molecule that hydrogen 
bonds to the phenolic oxygen (Fig. 7). The third group of structures place the 
thyroxine at the equivalent C-2 position (class 3). The equivalence is not exact 
because the energy refinement procedures broke the crystallographic symmetry. 
The rest of the structures produced by the program had relatively poor overlap 
scores and were wedged in various ways into the mouth of the cavity. These are not 
listed in Table 4 and were not explored further. Thus, as with the myoglobin-heme 
example, the thyroxine calculation yields groups of structures that span 
geometrically reasonable possibilities. 

Visual inspection, using computer graphics, of some of the docking geometries 
suggested significant differences. Specifically, we noticed that the precise matching 
of the ligand and receptor surfaces was much better for thyroxine oriented with the 
phenolic group inward than with the amino acid moiety inward. In the latter case, 
it was not possible to position the iodines into well-defined pockets. 

We examined the docking into the prealbumin site of four thyroxine derivatives 
in which the phenolic ring was replaced by naphthol. The results were quite similar 
to those for thyroxine: three classes of structures were found for each 
derivative. These three classes were closely related to the "inverted", "normal" 
and "C-2" structures for thyroxine. Our calculations suggest that all four isomers 
can fit within the binding pocket. 






Fig. 7. Surface for hormone binding pocket in prealbumin (blue); the surface and molecular 
framework for thyroxine (purple) in its reference position; the surface for thyroxine (green) in its best
"docked" position. The blue sphere in the lower right corner of the pocket is a water molecule. 
The binding constants of these compounds to prealbumin have been measured
(Blaney et al., 1982a; and Table 5). Visual inspection showed the 3'-Br-naphthol
derivative (W) made a better fit than any of the other naphthol derivatives while
the 7-OH group of the Z analog caused particularly unfavorable displacements
(Fig. 8), forcing the I-3 and I-5 iodines out of their receptor pockets.

To approach the question of "goodness of fit" more quantitatively, we developed 
the following scheme. Imagine a thin spherical shell (0.25 Å thick) extending
beyond the van der Waals' sphere for each ligand atom. Each surface point of the
receptor must lie in one of the following classes: (1) inside the van der Waals' sphere
of a ligand atom, (2) inside the spherical shell of a ligand atom, or (3) outside the 
spherical shells of all ligand atoms. We define a "fit" parameter as the number of 
the surface points in category 2 (shell points) minus the surface points in category 1 
(overlap points). The fit value provides two useful discriminations. It ranks the 
three thyroxine structural classes in the order: "phenolic in" > "C-2" > "amino 
acid in", an ordering that is intuitively reasonable and in agreement with the X-ray 
results. The ordering for the thyroxine analogs is T4 > W > Z > Y > X. While the
7-OH naphthol analog is placed too high on this list, the overall agreement is 
encouraging for so simple an approach. We recognize that serious comparisons of 
this type require more detailed calculations, but it is not difficult to construct 
algorithms that approximate the same quantity that the eye detects as "goodness 
of fit". 
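
The "fit" parameter can be sketched as follows. The precedence rule is an assumption: a surface point that lies inside any ligand atom's van der Waals' sphere is counted as an overlap point even if it also falls in another atom's shell. The function name and the tuple layouts are hypothetical.

```python
import math

def fit_parameter(surface_points, ligand_atoms, shell=0.25):
    """Count receptor surface points in ligand shells minus those overlapping.

    surface_points: list of (x, y, z); ligand_atoms: list of (x, y, z, radius).
    Category 1 (overlap): inside some atom's van der Waals sphere.
    Category 2 (shell): within `shell` angstroms outside some atom's sphere.
    Category 3: everything else, which does not affect the score.
    """
    shell_points = 0
    overlap_points = 0
    for point in surface_points:
        in_sphere = False
        in_shell = False
        for x, y, z, radius in ligand_atoms:
            d = math.dist(point, (x, y, z))
            if d < radius:
                in_sphere = True
                break  # overlap takes precedence (assumed)
            if d <= radius + shell:
                in_shell = True
        if in_sphere:
            overlap_points += 1
        elif in_shell:
            shell_points += 1
    return shell_points - overlap_points
```

A close but non-overlapping contact raises the score, while a penetrating one lowers it, which is the behaviour the scheme is meant to capture.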

Table 5

Docking thyroxine and derivatives to prealbumin

A. Docking geometries for thyroxine

Class†   Geometry                  Overlap‡   r.m.s.§   Fit||

1                                  2          8.03      62
2                                  23         0.45      172
3        C-2 related to class 2    31         1.60      158

B. Comparison of thyroxine and derivatives in class 2 docking geometry

Compound   Relative binding   Overlap‡   r.m.s.§   Fit||
           affinity¶

T4         1.00               23         0.45      172
W          0.29               17         1.11      166
Y          0.016              8          0.90      111
X          0.063              7          1.14      107
Z          0.007              12         1.29      133

† See the text.
‡ Defined for eqn (1), averages for best 5 structures (in arbitrary units).
§ r.m.s. deviation per atom compared to starting conformation, best 5 structures.
|| See the text, average for best 5 structures (in arbitrary units).
¶ Fraction of the apparent association constant of the derivative to that of T4. Data from Blaney
et al. (1982a).
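
The comparison between the fit ordering and the measured affinities can be reproduced from Table 5B with a few lines. The numbers below are transcribed from the table; the dictionary layout and the helper name are only illustrative.

```python
# Values transcribed from Table 5B (class 2 docking geometry).
table_5b = {
    "T4": {"affinity": 1.00, "fit": 172},
    "W": {"affinity": 0.29, "fit": 166},
    "Y": {"affinity": 0.016, "fit": 111},
    "X": {"affinity": 0.063, "fit": 107},
    "Z": {"affinity": 0.007, "fit": 133},
}

def rank_by(column):
    """Return compound names sorted from highest to lowest value of a column."""
    return sorted(table_5b, key=lambda name: table_5b[name][column], reverse=True)

by_fit = rank_by("fit")            # ordering predicted by the fit parameter
by_affinity = rank_by("affinity")  # ordering from measured binding
```

The two orderings agree for T4 and W; the Z analog is the compound that the fit parameter ranks higher than its measured affinity warrants.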
Fig. 8. Surface for thyroxine binding pocket and thyroxine as described for Fig. 7. The red surface is
the best docked position for the 7-OH naphthyl analog (Z) (see the text).

4. Discussion and Conclusions 

The main objective of this work was to design and test a set of algorithms that 
can explore in a reasonably complete manner the geometrically feasible alignments 
of a ligand and receptor of known rigid structure. This limited goal appears to be 
met by the program we have described above. The results for both test systems are: 

(1) Structures quite near the "correct" structures are readily recovered and 
identified as feasible solutions. 

(2) Other families of structures are found that are geometrically reasonable and 
that can be tested by simple scoring schemes, chemical intuition, or visual 
inspection with computer graphics. 

The important underlying assumptions that made these calculations feasible 
within a few hours of minicomputer time should be restated explicitly. 

(1) Both ligand and receptor structures were known in advance and were 
assumed to be rigid. The binding site of the ligand need not be known in 
advance, but was taken as given for prealbumin and readily selected by the 
clustering algorithm for myoglobin. 

(2) The center of the ligand and the center of the receptor pocket were assumed 
to be approximately coincident. 

(3) The numerical parameters used to select the spheres, to match distances, and 
to discard matches and structures certainly influence the total time. The 
parameters were chosen to be the same for both test cases, but further 
experiment is needed to establish their range of application. 



(a) Limitations 

We summarize the limitations of the program in its present form. 
(1) There is no simple way to overcome the need for known ligand and receptor 
structures. Molecular mechanics can provide a sampling of the favorable 
conformations of ligands of reasonable size, but the need for a detailed receptor
geometry will remain a severe limit. There are techniques in various stages of
development that build up a picture of the receptor geometry from ligand binding
data (Crippen, 1980; Marshall et al., 1978; Cramer, 1980) and these may offer useful
starting points for our procedures. 

(2) Without allowing molecular flexibility, many aspects of ligand-receptor 
interactions are not properly described. While the present program could explore 
the docking of several specific conformers of the ligand, of more interest would be 
changes in the matching and/or refinement packages to permit distortions of the 
rigid geometry. As noted earlier, one way to do this would be to use the structures 
generated here as the starting point for energy minimization or molecular dynamics 
calculations. 

(3) This paper assumes that the ligand and the receptor pocket are of roughly the 
same size. Two other situations are commonly found. First, there are examples of 
large binding sites that are incompletely filled by the ligand, such as that for 
trimethoprim in dihydrofolate reductase (Baker et al., 1981). We anticipate that 
this can be handled by the existing program, although the time needed may 
increase dramatically. The more difficult situation is the docking of a portion of a 
ligand into a site. A good example would be the docking of trypsin inhibitor with 
trypsin or, more generally, the matching of any two macromolecular surfaces. The 
program as it now stands might well be swamped by the large number of 
extraneous internal distances from portions of the ligand that were not involved in 
the active interface. What is needed is a way to break a large ligand up into a 
number of "projections" in analogy to the separation of the receptor surface into a 
number of binding "concavities". There are a number of ways this could be done. 

(4) The relationship between the manipulations and simple scoring schemes used 
here, and energy optimization techniques needs to be established. A major concern 
is the use of united atoms instead of explicit hydrogen atoms. The surfaces will 
certainly be modified somewhat when hydrogen atoms are introduced. Further, the 
local energy terms will be altered significantly. 



(b) Other applications 
Some of the procedures have applications beyond their use here. For example, 
the various pockets, and packing defects for protein surfaces can be identified and 
examined in a systematic manner as suggested for myoglobin. The matching 
algorithm may prove useful for identification of common geometric features in a set 
of compounds. Finally, the spherical representation of the receptor pocket provides 
a tool with which to explore ligand modification. 

Discussions with Y. Martin stimulated this project. M. Connolly, F. Cohen, P. Kollman
and P. Weiner provided useful suggestions. S. J. Oatley is a Mr and Mrs John Jaffe
Donation Research Fellow of the Royal Society. Funding from the National Institutes of
Health (GM-19267, I. D. Kuntz; RR-1081, R. Langridge) is gratefully acknowledged.
J.M.B. is supported in part by the American Foundation for Pharmaceutical Education. 